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Abstract 

We develop a maximum relative entropy formalism to generate optimal approximations to 
probability distributions. The central results consist in (a) justifying the use of relative entropy 
as the uniquely natural criterion to select a preferred approximation from within a family of 
trial parameterized distributions, and (b) to obtain the optimal approximation by marginaliz- 
ing over parameters using the method of maximum entropy and information geometry. As an 
illustration we apply our method to simple fluids. The "exact" canonical distribution is approx- 
imated by that of a fluid of hard spheres. The proposed method first determines the preferred 
value of the hard-sphere diameter, and then obtains an optimal hard-sphere approximation by 
a suitably weighed average over different hard-sphere diameters. This leads to a considerable 
improvement in accounting for the soft-core nature of the interatomic potential. As a numerical 
demonstration, the radial distribution function and the equation of state for a Lennard- Jones 
fluid (argon) are compared with results from molecular dynamics simulations. 

Keyword: Approximation method, Maximum entropy, Marginalization, Simple fluids, Hard sphere 

approximation 
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1 Introduction 

A common problem in statistical physics is that the probability distribution functions (PDFs) are 
always too complicated for practical calculations and we need to replace them by more tractable 
approximations. A possible solution is to identify a family of trial distributions {p(x)}, where x are 
parameters characterized systems and select the member of the family that is closest to the exact 
distribution P(x). The problem, of course, is that it is not clear what one means by 'closest'. One 
could minimize 
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dx \p{x) - P(x)} 2 , (1) 
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but why this particular functional and not another? And also, why limit oneself to an approximation 
by a single member of the trial family? Why not consider a linear combination of the trial distri- 
butions, some kind of average over the trial family? But then, how should we choose the optimal 
weight assigned to each p{x)l We propose to tackle these questions using the method of Maximum 
relative Entropy (which we abbreviate as ME) and information geometry pQ. The ME method, 
which is developed in 2J-[8J, has historical roots in the earlier method of maximum entropy that 
was pioneered by E. T. Jaynes and is commonly known as MaxEnt [5J. The ME method is designed 
for updating probabilities from arbitrary priors for information in the form of arbitrary constraints 
and it includes Bayes' rule and the older MaxEnt as special cases [7], [5]. 

The purpose of this paper is to develop a ME based method to generate optimal approximations 
(brief accounts of some of the results discussed below have previously been presented in [TU] and [TT]). 
The general formalism, which is the main result of this paper, is developed in section [2] In section 
12.11 we justify the use of relative entropy as the unique and natural criterion to select the preferred 
approximation, which is labeled by some parameters. The optimal approximation is obtained in 
section \2. 2 1 bv marginalizing over the variational parameters. A suitably weighed average over the 
whole family of trial distributions with the optimal weight provides an optimal approximation. 

In the second part of the paper we demonstrate the proposed formalism by applying it to simple 
classical fluids, a well studied field in the past [12]- [15]. To approximate the behavior of simple fluids 
we chose trial distributions that describe hard spheres [H]-[T3] (section The ME formalism is 
first used (section 14. ip to select the preferred value of the hard-sphere diameter. This is equivalent 
to applying the Bogoliubov variational principle and reproduces the results obtained by Mansoori 
et al. [16) whose variational principle was justified by a very different argument. 

An advantage of the variational or the ME methods over the perturbative approaches such as 
Barker and Henderson (BH) [12] and of Weeks, Chandler and Anderson (WCA) [13] is that there 
is no need for ad hoc criteria dictating how to separate the intermolecular potential into a strong 
short range repulsion and a weak long range attraction. On the other hand, a disadvantage of the 
standard variational approach is that it fails to take the softness of the repulsive core into account. 
At high temperatures this leads to results that are inferior to the perturbative approaches. 

In the standard variational a single preferred value of the hard-sphere diameter is selected. But, 
as discussed in [7J and [TTj . in the ME method non-preferred values are not completely ruled out. 
This allows us (in section 14. 2[) to marginalize over hard sphere diameter to obtain an optimal hard- 
sphere approximation with a suitable weighting. That leads to significant improvements over the 
standard variational method. 

In section [5] we test our method by comparing its predictions for a Lennard- Jones model for 
argon with molecular dynamics simulation data ([H], [E]). We find that the ME predictions for 
thermodynamic variables and for the radial distribution function are considerable improvements over 
the standard Bogoliubov variational result, and are comparable to the perturbative results [12] [15] . 
(For a recent discussion of some of the strengths and limitations of the perturbative approach see 
[20].). Despite the shortcomings of perturbation theory it remains very popular because it provides 
quantitative insights at a much lower computational cost than dynamical simulations. For recent 
applications to the glass transition and other more complex systems see [H] - [25] . Although in this 
work the ME is not applied to such complex problems one may fully expect that the information 
theory based ME method will yield insights not only the thermodynamic behavior of complex systems 
but also about the approximation methods needed to analyze them. Finally, our conclusions and 
some remarks on further improvements are presented in section [6] 

2 General formalism of ME optimal approximations 

Consider a system with microstates labeled by q (for example, the location in phase space or perhaps 
the values of spin variables). Let the probability that the microstate lies within a particular range 
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dq be given by the intractable canonical distribution 

P(q)dq = —^dq, (2) 

where Z = J dqe~^ H ^ = f e~@ F is partition function, in which H(q) is the Hamiltonian and free 
energy F is defined through this partition function. The goal is to generate an approximation p(q) 
that is optimal in the sense that is "closest" to the "exact" distribution P(q). It includes two steps. 
The first is to select a preferred trial distribution from a family of trials p'(q\9), each member of the 
trial family being labeled by one or more parameters 9. More generally one could define the trial 
family in a non-parametric way by specifying various constraints. The second is to marginalize over 
parameters 9 to obtain an optimal distribution. 

2.1 Entropic criterion for selecting preferred tractable PDFs 

Relative entropy as the selection criterion. The selection of the preferred approximation is 
achieved by ranking the distributions p{q) according to increasing preference, a real number T>[p] 
which we call the "entropy" of p. The numbers D[p] are such that if p\ is preferred over p2, then 
> T>[p2\. Thus, by design, the "preferred" approximation p is that which maximizes the 
entropy T>\p}. 

Next we determine the functional form of T>[p\. This is the general rule that provides the criterion 
for preference; in our case it defines what we mean by the "closest" or "preferred" approximation. 
The basic strategy [3] is one of induction: (1) if a general rule exists, then it must apply to special 
cases; (2) if in a certain special case we know which is the best approximation, then this knowledge 
can be used to constrain the form of T>[p\; and finally, (3) if enough special cases are known, then 
T)[p] will be completely determined. 

The known special cases are called the "axioms" of ME and they reflect the conviction that 
whatever information was originally codified into the exact P(q) is important and should be pre- 
served. The selected trial distribution should coincide with the exact one as closely as possible and 
one should only tolerate those minimal changes that are demanded by the information that defines 
the family of trials. Three axioms and their consequences are listed below. Detailed proofs and more 
extensive comments are given in [7] and [S]. 

Axiom 1: Locality. Local information has local effects. If the constraints that define the 
trial family do not refer to a certain domain D of the variable q, then the conditional probabilities 
p(q\D) need not be revised, p(q\D) = P(q\D). The consequence of the axiom is that non-overlapping 
domains of q contribute additively to the entropy: T>[p] — J dqF(p(q)) where F is some unknown 
function. 

Axiom 2: Coordinate invariance. The ranking should not depend on the system of coor- 
dinates. The coordinates that label the points q are arbitrary; they carry no information. The 
consequence of this axiom is that T>[p] = J dqp(q)f(p(q)/m(q)) involves coordinate invariants such 
as dqp(q) and p{q)/m{q) 1 where the function m{q) is a density, and both functions m and / are, at 
this point, unknown. 

Next we make a second use of Axiom 1 (locality). When there are no constraints at all and the 
family of trials includes the exact P(q) the selected trial should coincide with P(q); that is, the best 
approximation to P(q) is P(q) itself. The consequence is that up to normalization the previously 
unknown density m(q) is the exact distribution P(q). 

Axiom 3: Consistency for independent subsystems. When a system is composed of sub- 
systems that are independent it should not matter whether the approximation procedure treats them 
separately or jointly. Specifically, if q = (91,52), and the exact distributions for the subsystems, 
Pi(qi) and i^fe), are respectively approximated by pi(qi) and f^fe), then the exact distribution 
for the whole system P± (91)^2(92) should be approximated by p\ (qi)p2 (92)- This axiom restricts the 
function / to be a logarithm. 
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The overall consequence of these axioms is that the trial approximations p(q) should be ranked 
relative to the exact P(q) according to their (relative) entropy, 

V\p\P} = -J dqp(q)log^. (3) 

The derivation has singled out the relative entropy D[p|P] as the unique functional to be used for the 
purpose of selecting a preferred approximation. Other functionals, may be useful for other purposes, 
but they are not a generalization from the simple cases described in the axioms above. 
Remark. Suppose a member of a family of trial canonical distributions p'(q\&) with Hamiltonian 
H(q\9) that are conditional probability distributions and depend on parameters 9 = {9 1 , . . . ,9 n }, 
are given by 

-f3H{q\6) 

p'{q\9)dq = dq, (4) 

where Zg = J dq e~P H ^ q \ e ^ = f e~P F >> , in which free energy Fg is also dchncd. The preferred trial is 
then selected by maximizing V[p'\P]. Substituting Eq. © and Eq. Q into Eq. (O gives, 

V[p'\P] =f3((H e -H)e-F e + F) , (5) 

where (. . .)g refers to averages computed with the distribution p(q\9). The inequality 2?[p'|P] < 0, 
can then be written as 

F<F e + (H- H e )e . (6) 

Thus, maximizing X?[p'|P] is equivalent to minimizing the quantity Fg + (H — Hg)e- This form of the 
variational principle and its use to generate approximations is well known. It is usually associated 
with the name of Bogoliubov [26] and it is the main technique to generate mean field approximations 
for discrete systems of spins on a lattice. 



2.2 Marginalization for optimal PDF 

The extent to which the preferred 9 is preferred over other values ([7], [17] ) is expressed by the 
probability of 9, p(9). The original ME problem of assigning a probability to q is now broadened 
into assigning probabilities to both q and 9. In this section we use ME again to find the preferred 
joint distribution pj(q,9) = p(9)p(q\9). Note that this is the kind of problem where the Bayesian 
interpretation of probabilities is essential. Within a frequentist interpretation it makes no sense to 
talk about p(9) or about p{q\9) because 9 is not a random variable; the value of 9 is unknown but 
it is not random. 

To proceed we must ask a question. What is the prior distribution, that is, what do we know 
about q and 9 before the trial family is specified? 

The joint prior m(q, 9) can be expressed according to the product rule as, m{q, 9) = m(q\9)m(9) , 
where m(q\9) is conditional probability of observing system in state q given parameter 9. Our goal 
is to determine the preferred pj(q,9) that is closest to the prior m(q,9) that reflects our initial 
knowledge about q and ignorance about the 9s. Initially we know nothing about 9, not even how it 
is related to q. The prior that represents this state of knowledge is a product 

m{q,6) = P(q)fi(6) . (7) 

Indeed, when m{q 1 9) is a product no correlations between 9 and q are introduced which means that 
information about q tells us nothing about 9 and vice versa. The first factor in m(q, 9) reflects 
our prior knowledge about q: the distribution for q is known to be the exact P(q). The second 
factor reflects our complete ignorance about 9: we choose fi(9) to be as uniform as possible. Our 
method applies whether 9 is a discrete or a continuous variable. When 9 is a continuous variable. 
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Then the uniform distribution p,(9) is such that makes equal volumes in 9 space equally likely. To 
define these volumes we apply method of information geometry [1] and note that distances in 9- 
space are uniquely defined because the 9s are labels on probability distributions. Cases where 9 is a 
discrete variable are simpler. The relevant entropies involve sums over 9i rather than integrals and 
the natural uniform distribution is p-{9i) — constant. In what follows we concentrate on the more 
challenging continuous case. The unique distance between 9 and 9 + d9 is given by the Fisher-Rao 
metric pQ, dl 2 = jijd9 l d9^ , where 

7 « = J dqp(q\9) — t . (8) 

Accordingly, the volume of a small region d9 is r y 1 ^ 2 (9)d9, where ^(9) is the determinant of 7^. Up 
to an irrelevant normalization, the distribution p,(9) that is uniform in 9 is given by p,{9) = r y 1 ^ 2 (9). 

The preferred approximation pj(q, 9) to the joint distribution P(g)7 1//2 (#) is then obtained max- 
imizing the entropy 

VIpjI^P] = -J dqd9 P (9) Pe (q) log ^ff^ , (9) 

by varying p(9) subject to J d9p(9) = 1. The final result for the probability that 9 lies within the 
small volume r y 1 l 2 {9)d9 is 

p (e)d6 = ~ e v ^ p ^^ 2 {9)d9, (10) 

where D[pe|P] is given in Eq. ([5]) and £ is a normalization constant. Note also that the density 
expP[pe|P] is a scalar function and the presence of the Jacobian factor r y 1 ^ 2 {9) makes Eq. (fTU|) 
manifestly invariant under changes of the coordinates 9. Eq. (fTU|) expresses the degree to which 
values of 9 away from the preferred value are ruled out; it tells us that the preferred value of 9 is 
that which maximizes the probability density expP[pg|P]. 

Finally, now that we have determined the preferred joint distribution pj(q,9) = p(9)p(q\9) we 
can marginalize 9 and use the average 



p(q) = J d9p(9)p(q\9) (11) 

as the best approximation we can construct out of the given trial family. This approximation is 
expected to be better than any individual p(q\9) for the same reason that the mean is expected to 
be a better estimator than the mode - it minimizes the variance. 

This concludes the first part of our paper. To summarize: our main results consist in the 
justification of the relative entropy Eq. (J3J) as the uniquely natural functional to select the preferred 
approximations and the derivation of a quantitative measure of the degree to which the various trials 
are preferred, Eq. (fTT)|) . The final result for the best approximation is Eq. (fTTj) . 

Next we illustrate how this ME formalism is used in a specific example: simple fluids. 

3 ME optimal hard-sphere approximation for simple fluids 

3.1 Basic features of simple fluids 

A simple fluid composed of N single atom molecules is described by the Hamiltonian 

N 2 N 

H (<&) = Y,^ + U with u = E < r ^) ■ ( 12 ) 

i=i %>j 
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where qjy — {pi,rf, i — l,...,N} and the many-body interactions are approximated by a pair 
interaction, u(ry) where = |rj — r j- 1 . The probability that the positions and momenta of the 
molecules lie within the phase space volume 

1 N 

dqN = wmU dVn (13) 



Nlh 3N 

i—1 



is given by canonical distribntion 



Pf(q N ) dq N = ±- e -P H ^ dq N , (14) 
Z f 

where Zf — J dqN e~^ H ^ qN \ For fluids dominated by pair interactions most thermodynamic quan- 
tities of interest can be written in terms of the one- and two-particle density distributions 

n(r) = (n(r)) and n ( - 2 ' ) (r 1 ,r 2 ) — (n^ (r 1 ,r 2 )) (15) 

where 

n(r)=J2 s (r-n) (16) 

i 

and 

nW(n,r 2 )= ]T S (ri~n)5(r 2 - rj ) . (17) 

i>j'(Mj) 

The two-particle correlation function, 

nW(n,r 2 ) 

g(ri,r 2 ) = (18) 
n(ri)n(r 2 ) 

measures the extent to which the structure of liquids deviates from complete randomness. If the 
fluid is homogeneous and isotropic n(r) = p = N/V and g(ri,r 2 ) — g{\r\ — r 2 \) — g(r) where p is 
the bulk density and g(r) is the radial distribution function (RDF). Then, the pressure is given by 



Nk B T 6 J dr 



d 6 rr—^g(r) , (19) 



where (3 d = l/k B T M~M- 



3.2 Hard-sphere approximation 

To account for the short-distance repulsion we consider a family of trials composed by distributions 
that describe a gas of hard spheres of diameter r<j. For each rj, the Hamiltonian is 

N 2 

H hs (q N \r d )=J2 l^+Uhs (20) 

with 

N 

Uhs = },Uhs(nj\r<i) , (21) 

i>j 

where 

/ i \ / for r>r<j 

U hs( r \ r d ) = { r (22 

nsy 1 d7 1 oo for r < r d y J 



G 



and the corresponding probability distribution is 



1 



Phs(q N \r d ) = — e-fJWwIr,). 
The partition function and the free energy Fh s (T,V, N \rd) are Zhs = J d<lN e 



-l3H hs (q N \r d ) <J|l 



(23) 



del' 



-/3F h3 (T,y,JV|r d ) 



Two objections that can be raised for choosing Phs{<lN\Td) as trials are, first, that 
they do not take the long-range interactions into account; and second, that the actual short range 
potential is not that of hard spheres. These are points to which we will return later. A third 
objection, and this is considerably more serious, is that the exact hard-sphere RDF is not known. 
However, it can be calculated within the approximation of Percus and Yevick (PY) for which there 
exists an exact analytical solution ([27], [28], [29]) which is reasonably simple and in good agreement 
with numerical simulations over an extended range of temperatures and densities, except perhaps 
at high densities. There are several successful proposals [30, to improve upon the PY RDF but 
they also represent an additional level of complication. The simpler PY RDF is sufficiently accurate 
for our current objective - to illustrate the application and study the broad features of the ME 
approach. 

The PY RDF can be written in terms of the Laplace transform of rgi ls (r |r<j ) [29] . 



G(s) 



dy yghs(yrd\rd)e sv 



sL{s) 



12r) [L(s) + M(s)e s 



(24) 



where y is a dimensionless variable y = r/rd, 



L(s) = \2r\ 



i + - 2 v 



(1 + 2t?) 



MOO = (1 
and r\ is the packing fraction, 



■qf s 3 + 6/7(1 — rf) s 2 + 



def 1 3 



N 



127/(1 + 2rj) 



with p = — . 
H V 



(25) 
(26) 



(27) 



The RDF ghs{r Vd) is obtained from the inverse transform using residues |31j . 

The equation of state can then be computed in two alternative ways, either from the "pressure" 
equation or from the "compressibility" equation but, since the result above for ghs{r\rd) is not 
exact, the two results do not agree. It has been found that better agreement with simulations and 
with virial coefficients is obtained taking an average of the two results with weights 1/3 and 2/3 
respectively. The result is the Carnahan-Starling equation of state, [T2]-[T5] 



/ PV 



1 + T] + T] 2 



\Nk B Tj hs {1 -r,y 
The free energy, derived by integrating the equation of state, is 



F hs {T,V,N\r d ) = Nk B T 

where A = {2-kT? /raksT) 1 / 2 , and the entropy is 

'dF h 



-1 + lnpA 3 



471 - 3T7 2 

(i-v) 2 



V. 



hs 



dT 



n.v 



Fhs 
~T~ 



-Nk f 



(28) 



(29) 



(30) 



It must be remembered that these expressions are not exact. They are reasonable approximations 
for all densities up to almost crystalline densities (about rj sa 0.5). However, they fail to predict the 
face-centered-cubic phase when i] is in the range from 0.5 up the close-packing value of 0.74. 
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4 The ME formalism 



4.1 Preferred hard-sphere PDF 

As discussed in section [2 the trial Phs{qN\rd) that is "closest" to the "exact" Pj{q^) is found 
by maximizing the relative entropy 2?[p|P], Eq. with p = Phs{qN\rd) given by Eq. (12"3")) and 
P = Pf(q]y) given by Eq. (|14p . According to Eq. ©, it is equivalent to minimize 



Fjj = f F hs + (U — Uhs)hs 



(31) 



over all diameters r^, where (• • • )h s is computed with Phs(lN\i"d)- Thus, the variational approxima- 
tion to the free energy is 



F (T, V, N) « F V (T, V, N \r m ) d = min F V (T, V, N\r d ). 



(32) 



where r m is the preferred diameter. 
To calculate F v use 

1 



(U - U hs ) hs = ^ I d 3 rd 3 r' nf s (r,r') [u(r - r 1 ) - u hs (r - r'\r d ) 



where n^(r,r') = (h^{r,r')) hs . But u hs (r ~ r'\r d ) = for \r — r'\ > r d while n^(r,r') = for 



(2), 



\r — r'\ < r d , therefore 
with 



(U)hB 



F v = F hs + (U) ha 
1 



Np / d A ru{r)g hs {r \r d ), 



(33) 



(34) 



(2) 

where we have assumed that the fluid is isotropic and homogeneous, n hs (r, r ) 



4 2 i( 



introduced the hard-sphere RDF 



i | \ def %J r) 
9hs{r\r d ) = ns . 



r'\), and 



(35) 



Note that the approximation does not consist of merely replacing the exact free energy F by a 
hard-sphere free energy Fh s which neglects the effects of long range attraction; F is approximated 
by Fjj(r m ) which includes attraction effects through the (U)hs term in Eq. ([33]) . This addresses the 
first of the two objections mentioned earlier: the real fluid with interactions given by u is not being 
replaced by a hard-sphere fluid. The internal energy is approximated by {H)hs = ^NksT + (U)h s 
and not by (H hs ) hs = ^Nk B T. 

To calculate (U)hs it is convenient to write it in terms of V(s), the inverse Laplace transform of 
ru(r), 



yu{yr d ) 

For example, for a Lennard- Jones potential, 



u(r) — 4e 



ds V{s)e~ sy . 



we have 



V(s) = 4e 



12 «io 
10! 



6 „4 



U 4! 



(36) 



(37) 



(38) 
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Then, using equations Eq. ([55)) and ([2~4"]) gives 



(U) hs = 12A^ / V(s)G(s). (39) 
o 

Finally, it remains to minimize Fjj in Eq. (|33[) to determine the preferred diameter r m . This is done 
numerically in an explicit example for argon in section [5j 

4.2 Marginalization for optimal hard-sphere PDF 

The ME method as pursued in the last section has led us to determine a preferred hard-sphere 
diameter. It fails to take the softness of the repulsive core into account. As discussed in section 12.21 
our best assessment of the distribution of is given by the marginal over r d , 

Phs(qN) = f / dr d Pj(q N ,r d ) = / dr d P d (r d )P hs (q N \r d ). (40) 



The corresponding best approximation to the RDF is obtained using Eq. (jl5|l . (|T8|) . and ([35)1 

9hs(r) = / dq N P hs (q N )n 2 (r)/p 2 = / dr d P d (r d )g hs (r\r d ) . (41) 



By averaging over all hard-sphere diameters we are effectively describing a soft-core potential. Since 
(jhsif) takes into account soft-core effects while <7fc s (r Vm) does not, we expect that it will lead to 
improved estimates for all other thermodynamic quantities. 

However, we should emphasize that the distribution over the hard-sphere diameters P d (r d ) is not 
being introduced in an ad hoc way in order to "fix" the variational method. The introduction of 
Pd(r d ) is mandated by the ME method (section 12. 2[) . The distribution of diameters is given by Eq. 

m 

P d (r d )dr d = e —- 7 Va (r d ) dr d = ——^l 2 (r d ) dr d , (42) 

C M7 

where V [Ph s \P] = (3 (F — Fjj), the partition functions £ and <^u are given by 

C = e? F Cu with Cu - J dr d ^ (r d ) e"^, (43) 
and the natural distance dl 2 — 'y(rd)dr d in the space of r d s is given by the Fisher-Rao metric, 

f \ vl I s (d\ogP hs {q N \r d )\ 2 
l\Td) = J dq N P hs {q N \r d ) I ^- I . (44) 

A convenient way to calculate the Fisher-Rao metric is to express it as a second derivative of the 
entropy Eq. (JSJ) of p = P hs {qN \r' d ) relative to P = P hs (qN \r d ), 

l(u)=-^V[P hs {.\r> d )\P hs {-\r d )] , (45) 

d r' d =r d 

where 

V[P hs (.\r d )\P hs (-\r d )}={3 F hs \ r J~(U hs \ r J)r' d , (46) 

a d a _ 

and (■ • • ) r / is the average over Ph s (qN\r' d ) ■ As we argued above Eq. ([33)) the expectation of the 
potential energy {Uhs { r ' d ))r' d vanishes because the product Uh s (r \r' d )gh s (r \r' d ) vanishes for both 
r < r' d and r > r' d . Similarly, (Uhs {r d )) r ' d — when r' d > r d . However, when r' d < r d the expectation 
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(Uhs ( r d)) r ' d diverges. The divergence is a consequence of the unphysical nature of the hard-sphere 
model. For more realistic continuous potentials the distance between r' d = + drd and is the 
same as the distance between r' d = r& — drd and r<j. We can then always choose r' d > and define 
-f(rd) in Eq. PS]) as the limit r' d = r<i + + . Then, using Eq. P5|) for Ff ls , we have 



7M = P 



d 2 F hs (r' d ) 



dr» 



4 + 977 - 4r? 2 
= Nnpr d — ~r- . (47) 

r' a =r d +0+ U - V) 



To summarize, the distribution of diameters Pd(rd) is given by Eq. (|42j) with Fu given by 
Equations (|33| . ([29]) . (|39| and 7 given by Eq. (|47jl . Our best approximation to the "exact" P((?jv) is 
the PhsilN) given in Eq. (|40| . However, there is a problem. Since the free energy Fjj is an extensive 
quantity, Fu oc iV, for large N the distribution Pd(rd) ~ exp — /^.Fiy is very sharply peaked at the 
preferred diameter r m . When choosing a single preferred diameter for a macroscopic fluid sample we 
find that ME confers overwhelming probability to the preferred value. This is not surprising. The 
same thing happens when we calculate the global temperature or density of a macroscopic sample 
and yet local fluctuations can be important. The question then, is whether local fluctuations are 
relevant to the particular quantities we want to calculate. We argue that they are. 

From the very definition of g(r) as the probability that given an atom at a certain place another 
atom will be found at a distance r, it is clear that g(r) refers to purely local behavior and should 
be influenced by local fluctuations. To the extent that the preferred diameter r m depends on 
temperature and density we expect that local temperature and/or density fluctuations would also 
induce local diameter fluctuations. 

For the purpose of calculating g(r) the system is effectively reduced to the small number of atoms 
N e ff in the local vicinity of the reference atom at the origin. In order to develop a systematic, fully ME 
method for the determination of the effective number of particles N e ff that are locally relevant, one 
needs to use trial probability distributions that allow inhomogeneities in the hard sphere diameters. 
Yet determination of such trial probability distributions requires further investigations. Since we 
are interested in demonstrating ME formalism for optimizing approximations in this work only, we 
consider a rather simple approach to estimate N c g. Based on our ME approach to a mean field 
approximation for fluids [10] , we have shown the RDF is given by 

g MF (r) = e -0 u ( r )-0Pf d3r ' u ( r ')(aMF(r-r')-l) _ / 4g \ 

For a sufficiently dilute gas, the results of the mean field approximation are comparable with ex- 
perimental results. The gMF (r) is approximately close to the true, exact RDF g(R) for r < cr, the 
Lennard- Jones parameter. A fluid of hard spheres gives g(r\rd) = for r < rd and cannot reproduce 
the behavior of Eq. (|48|) . However, once we recognize that we can use a statistical mixture, Eq. 
(|40|) . we can tune the size N e g of the cell and thereby change the width of Pd (rd) so that the RDF 
cjhs (r) of Eq. (|41[) reproduces the known short-distance behavior of Eq. 



5 Numerical demonstrations: Lennard-Jones "argon" 

One of the difficulties in testing theories about fluids against experimental data is that it is not easy 
to see whether discrepancies are to be blamed to a faulty approximation or to a wrong intermolecular 
potential. This is why theories are normally tested against molecular dynamics numerical simulations 
where there is control over the intermolecular potential. In this section we compare ME results 
against simulation results [18] for a fluid of monoatomic molecules interacting through a Lennard- 
Joncs potential, Eq. (I37p . The parameters e and a (the depth of the well, w| m i n = — £, and the 
radius of the repulsive core, u(a) = 0, respectively) are chosen to model argon: e = 1.03 x 10 -2 eV 
and cr = 3.405A. 
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5.1 Preliminary examinations 

The free energy Fjj. Figure [1] (A) shows the free energy F(j /NkgT as a function of hard-sphere 
diameter for argon at a fixed density of per 3 — 0.65 for different temperatures. Figure[TJ(B) shows 
Fu/NksT as a function of for several densities at fixed T — 107.82 K. Since the critical point for 
argon is at T c = 150.69 K and p c er 3 = 0.33 all these curves, except that at 300 K, lie well within the 
liquid phase. The increase of Fu/NksT for high values of ra is due to short range repulsion between 
the hard spheres described by Fhs/NksT '. The increase for low r<2 is due to the Lennard-Jones 
short-range repulsion as described by {U)} ls /NkBT . 

The preferred Td is that which minimizes Fjj and depends both on temperature and density. The 
preferred diameter decreases as the temperature increases because atoms with higher energy can 
penetrate deeper into the repulsive core. The dependence with density is less pronounced. 
The distribution of diameters Pd(rd)- In section 14.21 we argued that the effective number of 
molecules that is relevant to the local structure of the fluid is not the total number of molecules in 
the system N, but a smaller number, N cS . In Fig[5J(A) we plot the distribution of diameters Pd(Td) 
for different temperatures, for a fixed fluid density of per 3 = 0.65, and for an arbitrarily chosen 
iVcff = 13500. As expected the distribution shifts to higher diameters as the temperature decreases. 
Notice also that the distribution becomes narrower at lower temperatures in agreement with the fact 
that a hard-sphere approximation is better at low T 12J. 

Figure [U(B) shows that increasing N e g (with fixed density p) decreases the width of Pd{rd) 
(solid lines) and induces a slight shift of the whole distribution. This is due to the dependence 
~ (Acffrd) 1 / 2 of the Fisher- Rao measure 7 1 / 2 (r<j) in 27] Figure [U(B) also explores the influence of 
(^d) by comparing the actual distributions Pd(r^) (solid lines) with the distributions e -13 ^^^ 
(dotted lines) which are obtained by setting 7 1 / 2 = 1 in Eq. (|42|) . The effect of 7 1 / 2 is to shift the 
distribution slightly to higher r^. 

5.2 Two properties of argon 

The radial distribution function. We are finally ready to calculate the radial distribution g(r) 
for argon. We start by estimating the number of molecules N e g that are locally relevant; as explained 
earlier we choose N e s so that our best approximation g^s (r), Eq. (|41| . reproduces the known short- 
distance behavior guF (f), Eq. (|4"8"|) , for r <C a. We have found that the estimates for N e g need not 
be very accurate but that they must be obtained for each value of the temperature and density In 
Fig[3] we show an example of the short-distance behavior of g~h s for three values of N e g at T = 107.82 
K and per 3 = 0.65; using a Chi-square fit in the range from r = 2.9 to 3.1 A the selected best value 
of Acff is around 38000. 

In figures [4] (A)-(D) we compare three different ways to calculate the RDF. The solid line is 
Verlet's molecular dynamics simulation [18] ; it plays the role of experimental data against which we 
compare our theory. The dotted line is c//j S (r|r m ) for the hard-sphere fluid with preferred diameter 
r m . This curve, calculated from inverse of Eq. ([24]) . is also the result of the variational method and 
coincides with the ME result for a macroscopically large A c ff = N. The dashed line is the averaged 
(jhsir) of the extended ME analysis. Figures. HJ(A)-(C) were plotted at three different temperatures 
T = 107.82, 124.11 and 189.76 K at the density pa 3 = 0.65. Figure H(D) we changed the density 
and the temperature to per 3 = 0.5 and T = 162.93 K. The agreement between the ME curve and 
Verlet's data is good. The vast improvement over the simpler variational method calculation is clear. 

One might be tempted to dismiss this achievement as due to the adjustment of the parameter 
iV e ff but this is not quite correct: N c g has not been adjusted, it has been calculated by fitting g~hs(r) 
to optimal mean field RDF jMf , (f) for r < a. Indeed, despite the fact that the hard-sphere trial 
solutions that we employ are mere approximations, the functional form of the whole curve g~hs{r) in 
Eq. (|4"T|) is reproduced quite well. 

However, one may note that the agreement between the ME prediction and Verlet's data becomes 
worse when the fluid density is decreased or the temperature is increased. The reason has been spelled 
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out by in the studies of WCA [T3] They demonstrate that both the repulsive and attractive forces 
contribute to the fluid structure when fluid is at low and moderate densities (0.4 < pa 3 < 0.65). 
However, when the fluid density is high enough (per 3 > 0.65), the repulsive force becomes dominant. 
Because the hard-sphere approximation does not include the attractive force, the hard-sphere RDF 
does not take the attractive force into account, and this error propagates into our ME prediction. 
The same discrepancy is also revealed in the WCA theory for low density [T5] . 
The equation of state. Finally we use the RDF to calculate the equation of state from the pressure 
equation, Eq. p9[) . In Fig [5] we compare the equation of state derived from the g(r) obtained from 
Verlet's simulation with calculations using the EME and variational methods and the perturbative 
theories of Barker and Henderson [TJ] and of Weeks, Chandler and Anderson [TJ], at T = 161.73 
K. The EME results constitute a clear improvement over the plain variational calculation. For low 
densities all four methods agree with each other but differ from the simulation. A better agreement in 
this region would probably require a better treatment of two-particle correlations at long distances. 
At intermediate densities the best agreement is provided by the EME and BH results, while the 
WCA theory seems to be the best at high densities. Also shown in Fig [5] (A) are experimental data 
on argon [32) . The discrepancy between the experimental curve and the Verlet simulation is very 
likely due to the actual potential not being precisely of the Lennard- Jones type. 

In Fig [3(B) we plot the EME equation of state for three different isotherms (T = 137.77, 161.73 
and 328. 25K). To compare to the simulation of Hansen and Verlet [TO] we plot j3P (rather than 
f3P/ p) as a function of density per 3 because this kind of plot exhibits the characteristic van der Waals 
loop that signals the liquid-gas transition as the temperature drops. A more exhaustive exploration 
lies, however, outside the scope of this paper. 

6 Conclusion 

The goal of this paper has been to use the EME method to generate approximations and show that 
this provides a generalization of the Bogoliubov variational principle. This addresses a range of 
applications that lie beyond the scope of the traditional MaxEnt. To test the method we considered 
simple classical fluids. 

When faced with the difficulty of dealing a system described by an intractable Hamiltonian, the 
traditional approach has been to consider a similar albeit idealized system described by a simpler 
more tractable Hamiltonian. The approach we have followed here departs from this tradition: our 
goal is not to identify an approximately similar Hamiltonian but rather to identify an approximately 
similar probability distribution. The end result of the EME approach is a probability distribution 
which is a sum or an integral over distributions corresponding to different hard-sphere diameters. 
While each term in the sum is of a form that can be associated to a real hard-sphere gas, the sum 
itself is not of the form exp —j3H, and cannot be interpreted as describing any physical system. 

As far as the application to simple fluids is concerned the results achieved in this paper represent 
progress but further improvements are possible by using better approximations to the hard-sphere 
fluid and by choosing a broader family of trial distributions. An important improvement would be 
to use trial probability distributions that allow inhomogeneities in the hard sphere diameters. This 
would lead to a systematic, fully EME method for the determination of the effective number of 
particles N e g that are locally relevant. 

Many perturbative approaches to fluids had been proposed, and a gradual process of selection 
over many years of research led to the optimized theories of BH and WCA. The variational ap- 
proach was definitely less satisfactory than these "best" perturbation theories. With our work, 
however, the situation has changed: the EME-improved variational approach offers predictions that 
already are competitive with the best perturbative theories. And, of course, the potential for further 
improvements of the EME approach remains, at this early date, far from being exhausted. 
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Figure 1: (A): The free energy Fjj as a function of hard-sphere diameter for argon at a density 
of pa 3 = 0.65 for different temperatures. The best is that which minimizes Fjj. (B): Fu as a 
function of for argon at T = 107.82 K for different densities. 
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Figure 2: (A): The distribution of hard-sphere diameters r d for argon for several temperatures at 
density pa 3 = 0.65 for N cS = 13500. (B): P d (r d ) for various iV cff at T = 107.82 K and pa 3 = 0.65. 
By setting 7 1 / 2 = 1 (dotted lines) we see that the effect of the 7 1 / 2 factor is to cause a slight shift 
of the distribution. 
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Figure 3: Estimating N c g by requiring that ghs(f) have the correct short-distance behavior e 
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Figure 4: The radial distribution function for (a) the hard-sphere fluid with optimal diameter r m ; 
(b) Verlet's molecular dynamics simulation; and (c) the improved EME analysis, for argon at (A): 
density per 3 = 0.65, temperature T — 107.82 K, and effective particle number N e g = 38000. (B): 
per 3 = 0.65, T = 124.11 K, and iV cff = 40000. (C): per 3 = 0.65, T = 189.76 K, and N cS = 50000. 
(D): per 3 = 0.5, T = 162.93 K, and N cB = 62000. 
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Figure 5: (A): The argon equation of state calculated using the EME method, the variational method 
and the perturbative theories of BH and WCA are compared to the Verlet simulation at T = 161.73 
K. Also shown are Levelt's experimental results. (B): /3P versus the reduced density per 3 calculated 
using the EME method (solid line) and compared to the Hansen- Verlet simulation for three different 
isotherms. The graph shows the appearance of the liquid-gas van der Waals loop as the temperature 
drops. 
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